Context-dependent sound event detection

Authors

  • Toni Heittola
  • Annamaria Mesaros
  • Antti J. Eronen
  • Tuomas Virtanen
Abstract

The work presented in this article studies how context information can be used in the automatic sound event detection process, and how the detection system can benefit from such information. Humans use context information to make more accurate predictions about sound events and to rule out events that are unlikely in a given context. We propose a similar use of context information in the automatic sound event detection process. The proposed approach is composed of two stages: an automatic context recognition stage and a sound event detection stage. Contexts are modeled using Gaussian mixture models and sound events are modeled using three-state left-to-right hidden Markov models. In the first stage, the audio context of the tested signal is recognized. Based on the recognized context, a context-specific set of sound event classes is selected for the sound event detection stage. The event detection stage also uses context-dependent acoustic models and count-based event priors. Two alternative event detection approaches are studied. In the first, a monophonic event sequence is output by detecting the most prominent sound event at each time instance using Viterbi decoding. The second approach introduces a new method for producing a polyphonic event sequence by detecting multiple overlapping sound events with multiple restricted Viterbi passes. A new metric is introduced to evaluate sound event detection performance at various levels of polyphony. It combines detection accuracy and coarse time-resolution error into a single metric, making the comparison of detection algorithms simpler. The two-step approach was found to improve the results substantially compared to a context-independent baseline system. At the block level, detection accuracy is almost doubled by the proposed context-dependent event detection.
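The abstract describes a two-stage pipeline: GMM-based context recognition followed by event detection restricted to the classes of the recognized context. The following Python sketch is only an illustration of that structure under simplifying assumptions: all names (train_context_models, context_event_sets, block scoring, etc.) are hypothetical, count-based priors and the polyphonic multiple-restricted-Viterbi decoding are omitted, and scoring each block against individual event HMMs stands in for the paper's Viterbi decoding over a connected network of context-specific event models.

```python
# Illustrative sketch of the two-stage approach described in the abstract.
# Assumptions: MFCC-like frame features as NumPy arrays, scikit-learn GMMs for
# context models, hmmlearn three-state left-to-right HMMs for event models.
import numpy as np
from sklearn.mixture import GaussianMixture
from hmmlearn import hmm


def train_context_models(features_per_context, n_components=16):
    """Stage 1 models: one GMM per context (e.g. street, office)."""
    models = {}
    for context, feats in features_per_context.items():   # feats: (n_frames, n_dims)
        gmm = GaussianMixture(n_components=n_components, covariance_type="diag")
        models[context] = gmm.fit(feats)
    return models


def recognize_context(context_models, feats):
    """Pick the context whose GMM gives the highest average log-likelihood."""
    return max(context_models, key=lambda c: context_models[c].score(feats))


def train_event_model(sequences):
    """One three-state left-to-right HMM per sound event class."""
    model = hmm.GaussianHMM(n_components=3, covariance_type="diag",
                            init_params="mc", params="stmc")
    model.startprob_ = np.array([1.0, 0.0, 0.0])            # always start in state 0
    model.transmat_ = np.array([[0.9, 0.1, 0.0],            # self-loop or move forward
                                [0.0, 0.9, 0.1],
                                [0.0, 0.0, 1.0]])
    model.fit(np.vstack(sequences), lengths=[len(s) for s in sequences])
    return model


def detect_events(event_models, context_event_sets, context, feats, block_len=100):
    """Stage 2: label each block with the best-scoring event allowed in the context."""
    allowed = context_event_sets[context]                   # context-specific class set
    labels = []
    for start in range(0, len(feats), block_len):
        block = feats[start:start + block_len]
        labels.append(max(allowed, key=lambda e: event_models[e].score(block)))
    return labels
```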


Similar resources

Sound Event Detection and Context Recognition

Humans can easily segregate and recognize one sound source from an acoustic mixture, and recognize a certain voice from a busy background which includes other people talking and music. Sound event detection and classification aims to process an acoustic signal and convert it into descriptions of the corresponding sound events present at the scene. This is useful, e.g., for automatic tagging in ...


Neural representations of auditory input accommodate to the context in a dynamically changing acoustic environment.

The auditory scene is dynamic, changing from 1 min to the next as sound sources enter and leave our space. How does the brain resolve the problem of maintaining neural representations of the distinct yet changing sound sources? We used an auditory streaming paradigm to test the dynamics of multiple sound source representation, when switching between integrated and segregated sound streams. The ...


Context-Dependent Event Detection in Sensor Networks

Event-based systems are well suited for application in sensor networks. Compared to the traditional application domains of event-based systems, however, sensor networks impose a number of challenging requirements. In this short paper, we present an architecture designed to address these requirements: the CoDED platform for context-dependent event detection.


Sound Event Detection for Real Life Audio DCASE Challenge

We explore logistic regression classifier (LogReg) and deep neural network (DNN) on the DCASE 2016 Challenge for task 3, i.e., sound event detection in real life audio. Our models use the Mel Frequency Cepstral Coefficients (MFCCs) and their deltas and accelerations as detection features. The error rate metric favors the simple logistic regression model with high activation threshold on both se...
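As a rough illustration of the setup mentioned in this blurb (MFCCs with deltas and accelerations feeding a logistic regression classifier with a high activation threshold), here is a minimal sketch; function names and parameter values are assumptions for illustration, not taken from the cited paper.

```python
# Minimal illustrative sketch (assumed names and values, not from the cited paper):
# MFCCs plus deltas and accelerations as frame features, one logistic
# regression classifier per event class, thresholded at a high activation.
import numpy as np
import librosa
from sklearn.linear_model import LogisticRegression

def frame_features(y, sr, n_mfcc=20):
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    delta = librosa.feature.delta(mfcc)             # first-order derivatives
    accel = librosa.feature.delta(mfcc, order=2)    # second-order derivatives
    return np.vstack([mfcc, delta, accel]).T        # (n_frames, 3 * n_mfcc)

def detect_event(clf: LogisticRegression, feats, threshold=0.8):
    # A high threshold trades recall for precision, which can help under
    # an error-rate style metric.
    return clf.predict_proba(feats)[:, 1] >= threshold
```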


Neurophysiological evidence for context-dependent encoding of sensory input in human auditory cortex.

Attention biases the way in which sound information is stored in auditory memory. Little is known, however, about the contribution of stimulus-driven processes in forming and storing coherent sound events. An electrophysiological index of cortical auditory change detection (mismatch negativity [MMN]) was used to assess whether sensory memory representations could be biased toward one organizati...



Journal:
  • EURASIP J. Audio, Speech and Music Processing

Volume: 2013   Issue:

Pages: -

Publication date: 2013